Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement API endpoint that returns TSV report about submissions that are pending review #1443

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

cristina-stonepedraza
Copy link

@cristina-stonepedraza cristina-stonepedraza commented Nov 12, 2024

Add a script to api.py and a query to crud.py to pull information from the submission portal database, and then generate a report of NMDC submissions that have been submitted as a TSV.

This PR references issue 2047 in nmdc-schema: microbiomedata/nmdc-schema#2047

Copy link
Collaborator

@eecavanna eecavanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @cristina-stonepedraza, I think this looks great overall! the code looks clean to me and I had no issues following it.

I left some feedback about path, function, and variable naming, as well as one about the endpoint description. In case you have any questions about these, you can message me here/Slack/etc.

There is one more thing I want to see in this PR:

  • An automated test

I'll send you information about that on Slack. I posted information about that below.

nmdc_server/api.py Outdated Show resolved Hide resolved
nmdc_server/api.py Outdated Show resolved Hide resolved
nmdc_server/api.py Outdated Show resolved Hide resolved
nmdc_server/api.py Outdated Show resolved Hide resolved
nmdc_server/api.py Outdated Show resolved Hide resolved
nmdc_server/crud.py Show resolved Hide resolved
nmdc_server/crud.py Outdated Show resolved Hide resolved
@eecavanna eecavanna changed the title 2047 mixs endpoint Implement API endpoint that returns TSV report about submissions that are pending review Nov 14, 2024
@eecavanna
Copy link
Collaborator

Hi @cristina-stonepedraza, here's an example of an automated test that targets an endpoint like this one:

def test_get_metadata_submissions_report_as_non_admin(
db: Session, client: TestClient, logged_in_user
):
response = client.request(method="GET", url="/api/metadata_submission/report")
assert response.status_code == 403
def test_get_metadata_submissions_report_as_admin(
db: Session, client: TestClient, logged_in_admin_user
):
now = datetime.utcnow()
# Create two submissions, only one of which is owned by the logged-in user.
logged_in_user = logged_in_admin_user # allows us to reuse some code snippets
submission = fakes.MetadataSubmissionFactory(
author=logged_in_user,
author_orcid=logged_in_user.orcid,
created=now,
)
fakes.SubmissionRoleFactory(
submission=submission,
submission_id=submission.id,
user_orcid=logged_in_user.orcid,
role=SubmissionEditorRole.owner,
)
other_user = fakes.UserFactory()
other_submission = fakes.MetadataSubmissionFactory(
author=other_user,
author_orcid=other_user.orcid,
created=now + timedelta(seconds=1),
metadata_submission={
"studyForm": {
"studyName": "My study name",
"piName": "My PI name",
"piEmail": "My PI email",
},
},
status="in-progress",
source_client="field_notes",
)
db.commit()
response = client.request(method="GET", url="/api/metadata_submission/report")
assert response.status_code == 200
# Confirm the response payload is a TSV file having the fields and values we expect;
# i.e. below its header row, it has two data rows, each representing a submission,
# ordered from most recently-created to least recently-created.
# Reference: https://docs.python.org/3/library/csv.html#csv.DictReader
fieldnames = [
"Submission ID",
"Author ORCID",
"Author Name",
"Study Name",
"PI Name",
"PI Email",
"Source Client",
"Status",
]
reader = DictReader(response.text.splitlines(), fieldnames=fieldnames, delimiter="\t")
rows = [row for row in reader]
assert len(rows) == 3 # includes the header row
header_row = rows[0] # gets the header row
assert len(list(header_row.keys())) == len(fieldnames)
data_row = rows[1] # gets the first data row (the most recently-created submission)
assert data_row["Submission ID"] == str(other_submission.id)
assert data_row["Author ORCID"] == other_user.orcid
assert data_row["Author Name"] == other_user.name
assert data_row["Study Name"] == "My study name"
assert data_row["PI Name"] == "My PI name"
assert data_row["PI Email"] == "My PI email"
assert data_row["Source Client"] == "field_notes"
assert data_row["Status"] == "in-progress"
data_row = rows[2] # gets the second data row
assert data_row["Submission ID"] == str(submission.id)
assert data_row["Author ORCID"] == logged_in_user.orcid
assert data_row["Author Name"] == logged_in_user.name
assert data_row["Study Name"] == ""
assert data_row["PI Name"] == ""
assert data_row["PI Email"] == ""
assert data_row["Source Client"] == "" # upstream faker lacks `source_client` attribute
assert data_row["Status"] == "In Progress" # matches value in upstream faker

There are actually two tests there because the endpoint in question is only accessible to admins. One of the tests focuses on that aspect (i.e. security).

By the way, I do wonder whether we will restrict access to this reporting endpoint also. You/we can discuss that with @mslarae13.

@mslarae13
Copy link

By the way, I do wonder whether we will restrict access to this reporting endpoint also. You/we can discuss that with @mslarae13.

@eecavanna I assume the submission status end point that I use for quarterly metrics is limited, yes? I think this should be the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants